Finite Sample Results

#econometrics #economics

Oh, Hyunzi. (email: wisdom302@naver.com)
Korea University, Graduate School of Economics.
2024 Spring, instructed by prof. Kim, Dukpa.


Main References

  • Kim, Dukpa. (2024). "Econometric Analysis" (2024 Spring) ECON 518, Department of Economics, Korea University.
  • Davidson and MacKinnon. (2021). "Econometric Theory and Methods", Oxford University Press, New York.

Model and Assumptions

The data generating process (DGP) assumes that some scalar random variable $y_i$ is generated by the following model:
$$y_i = x_i'\beta + \varepsilon_i, \quad i = 1, \dots, n,$$
where $x_i$ is a $k \times 1$ vector of regressors and $\beta$ is a $k \times 1$ vector of coefficients. We assume $\beta$ and $\sigma^2 = \operatorname{Var}(\varepsilon_i)$ are unknown.

In matrix notation,
$$y = X\beta + \varepsilon,$$
where $y$ and $\varepsilon$ are $n \times 1$ and $X$ is $n \times k$.

Remark (constant regressor).

We assume that one of the regressors (the first, say) is a constant regressor, i.e. $x_{i1} = 1$ for all $i$.

Therefore, the least-squares estimate of $\beta$ is given as
$$\hat{\beta} = (X'X)^{-1}X'y,$$
which is linear in the data $y$.
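As a quick illustration, here is a minimal numpy sketch of this formula on simulated data. The design, sample size, and coefficient values are arbitrary choices for the example, not part of the lecture notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n observations, k = 3 regressors (first column constant).
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
eps = rng.normal(size=n)
y = X @ beta + eps

# Least-squares estimate: beta_hat = (X'X)^{-1} X'y.
# Solving the normal equations is numerically safer than forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # close to the true beta
```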

Unbiasedness of Least-Squares Estimate

Definition (unbiasedness).

A statistic $\hat{\theta}$ is an unbiased estimator of $\theta$ if $\mathbb{E}[\hat{\theta}] = \theta$.

The unbiasedness of $\hat{\beta}$ requires the following assumptions.

Assumption (A1~A3).
  • A1) $\mathbb{E}[\varepsilon_i \mid X] = 0$, $i = 1, \dots, n$.
  • A2) The model relating $y$ and $X$ is linear and given by $y = X\beta + \varepsilon$.
  • A3) $n > k$ and $\operatorname{rank}(X) = k$, so that $X'X$ is invertible.
Proposition (unbiasedness of least-squares estimate).

Under Assumption 3 (A1~A3), the least squares estimates are unbiased.

Proof. Since $y = X\beta + \varepsilon$, we have
$$\hat{\beta} = (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'\varepsilon.$$
Since we have $\mathbb{E}[\varepsilon \mid X] = 0$ by A1, by the law of iterated expectations, we have
$$\mathbb{E}[\hat{\beta}] = \beta + \mathbb{E}\big[(X'X)^{-1}X'\,\mathbb{E}[\varepsilon \mid X]\big] = \beta,$$
thus $\mathbb{E}[\hat{\beta}] = \beta$, i.e. $\hat{\beta}$ is an unbiased estimate.
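The unbiasedness claim is easy to check by simulation. The sketch below is an illustration, not part of the notes; all parameter values are arbitrary. It redraws data repeatedly under A1~A3 and averages the estimates across replications.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, R = 50, 3, 5000               # sample size, regressors, replications
beta = np.array([1.0, 2.0, -0.5])   # hypothetical true coefficients

estimates = np.empty((R, k))
for r in range(R):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    eps = rng.normal(size=n)        # E[eps | X] = 0 holds by construction (A1)
    y = X @ beta + eps
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

# The Monte Carlo average should be close to the true beta.
print(estimates.mean(axis=0))       # approx [1.0, 2.0, -0.5]
```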

Zero Mean Error and Residual

Remark (zero mean error term).

From A1, we have
$$\mathbb{E}[\varepsilon_i] = \mathbb{E}\big[\mathbb{E}[\varepsilon_i \mid X]\big] = 0,$$
where the first equality holds by the law of iterated expectations.

Remark (zero mean residual).

From A1, we have
$$\mathbb{E}[\hat{\varepsilon} \mid X] = \mathbb{E}[y - X\hat{\beta} \mid X] = X\beta - X\,\mathbb{E}[\hat{\beta} \mid X] = 0,$$
thus we have $\mathbb{E}[\hat{\varepsilon}] = 0$ by the law of iterated expectations.
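Relatedly, with a constant regressor the residuals also have exactly zero sample mean by construction, since the normal equations impose $X'\hat{\varepsilon} = 0$. A minimal numerical check, using an arbitrary simulated design:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one regressor
y = X @ np.array([0.5, 1.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

# With a constant regressor the normal equations force the residuals
# to sum to exactly zero (up to floating-point error).
print(resid.sum())   # ~ 0
```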

Uncorrelated Errors and Regressors

Assumption 1

A1 states that the expected value of the errors conditional on the regressors is zero, which implies that the errors and regressors are uncorrelated: the unexplained part of the dependent variable is not correlated with the explanatory variables.

A1 is the most crucial of the assumptions. If A1 does not hold, then the least squares estimate can be biased.

Remark (uncorrelated error and regressors).

From A1, we have
$$\mathbb{E}[x_i \varepsilon_i] = \mathbb{E}\big[x_i\, \mathbb{E}[\varepsilon_i \mid X]\big] = 0,$$
and therefore,
$$\operatorname{Cov}(x_i, \varepsilon_i) = \mathbb{E}[x_i \varepsilon_i] - \mathbb{E}[x_i]\,\mathbb{E}[\varepsilon_i] = 0,$$
meaning that the error term and the regressors are uncorrelated.

Conversely, if $\operatorname{Cov}(x_i, \varepsilon_i) \neq 0$, then we have $\mathbb{E}[\varepsilon_i \mid X] \neq 0$, thus A1 does not hold.
From now on, we provide two examples where A1 does not hold, causing an omitted variable problem.

Example (correlation between education level and wage).

Given a cross-sectional regression model
$$wage_i = \beta_1 + \beta_2\, educ_i + \varepsilon_i,$$
person $i$'s innate ability can affect both $educ_i$ and $wage_i$. Since ability is omitted from the regressors, it is absorbed into $\varepsilon_i$, and thus we have
$$\operatorname{Cov}(educ_i, \varepsilon_i) \neq 0,$$
thus A1 does not hold, causing $\hat{\beta}_2$ to be biased.

Example (correlation between fertilizer distributed and crop yield).

Given a cross-sectional regression model
$$yield_i = \beta_1 + \beta_2\, fert_i + \varepsilon_i.$$
However, if the fertilizer needs to be diluted with water, then the amount of water also affects the yield and is correlated with the amount of fertilizer. If the true data generating process is given as
$$yield_i = \beta_1 + \beta_2\, fert_i + \beta_3\, water_i + u_i,$$
then we have the omitted variable problem, where $\varepsilon_i = \beta_3\, water_i + u_i$. Thus,
$$\operatorname{Cov}(fert_i, \varepsilon_i) = \beta_3\, \operatorname{Cov}(fert_i, water_i) \neq 0.$$
Thus A1 does not hold, causing $\hat{\beta}_2$ to be biased.
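A small simulation makes the bias visible; this is an illustration with arbitrary hypothetical coefficients, not from the notes. The short regression that omits water overstates the fertilizer effect whenever $\beta_3\, \operatorname{Cov}(fert, water) > 0$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, R = 200, 2000
b1, b2, b3 = 1.0, 2.0, 1.5     # true DGP: yield = b1 + b2*fert + b3*water + u

slopes = np.empty(R)
for r in range(R):
    water = rng.normal(size=n)
    fert = 0.8 * water + rng.normal(size=n)   # fert correlated with omitted water
    u = rng.normal(size=n)
    yield_ = b1 + b2 * fert + b3 * water + u
    X = np.column_stack([np.ones(n), fert])   # short regression omits water
    slopes[r] = np.linalg.solve(X.T @ X, X.T @ yield_)[1]

# The short-regression slope is biased away from b2 = 2.0:
# plim = b2 + b3 * Cov(fert, water) / Var(fert) ~ 2.73 in this design.
print(slopes.mean())
```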

Average Marginal Effects

Remark (average marginal effects).

From A1, we have
$$\mathbb{E}[y_i \mid x_i] = x_i'\beta.$$
Taking the derivative with respect to $x_{ij}$, we have
$$\frac{\partial\, \mathbb{E}[y_i \mid x_i]}{\partial x_{ij}} = \beta_j,$$
implying that the least squares estimate $\hat{\beta}_j$ estimates the marginal effect of a one-unit increase in the $j$-th regressor.

Example (marginal effect with cross term).

From Example 9 (correlation between fertilizer distributed and crop yield), let the regression model be
$$yield_i = \beta_1 + \beta_2\, fert_i + \beta_3\, water_i + \beta_4 (fert_i \times water_i) + \varepsilon_i,$$
then the marginal effect of the fertilizer is
$$\frac{\partial\, \mathbb{E}[yield_i \mid fert_i, water_i]}{\partial\, fert_i} = \beta_2 + \beta_4\, water_i,$$
thus the marginal effect depends on the amount of water.
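A brief sketch of this point, with hypothetical coefficient values chosen for illustration: fit the interacted model and evaluate $\hat{\beta}_2 + \hat{\beta}_4 \cdot water$ at several water levels.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
fert = rng.uniform(0, 10, size=n)
water = rng.uniform(0, 5, size=n)
y = 1.0 + 2.0 * fert + 1.5 * water + 0.3 * fert * water + rng.normal(size=n)

X = np.column_stack([np.ones(n), fert, water, fert * water])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted marginal effect of fertilizer: b[1] + b[3] * water,
# so it varies with the amount of water; report it at a few values.
for w in (0.0, 2.5, 5.0):
    print(w, b[1] + b[3] * w)
```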

Variance of Least-Squares Estimate

Assumption (A4~A5).
  • A4) $\operatorname{Var}(\varepsilon_i \mid X) = \sigma^2$, $i = 1, \dots, n$: homoskedasticity in the error term.
  • A5) $\mathbb{E}[\varepsilon_i \varepsilon_j \mid X] = 0$, $i \neq j$: independently distributed error terms.
Proposition (spherical errors).

Under A1~A5, we have
$$\operatorname{Var}(\varepsilon \mid X) = \mathbb{E}[\varepsilon\varepsilon' \mid X] = \sigma^2 I_n \quad \text{and} \quad \operatorname{Var}(\varepsilon) = \sigma^2 I_n,$$
where $I_n$ is the $n \times n$ identity matrix.

Proof. Since $\mathbb{E}[\varepsilon \mid X] = 0$, we have $\operatorname{Var}(\varepsilon \mid X) = \mathbb{E}[\varepsilon\varepsilon' \mid X]$. Therefore,
$$\mathbb{E}[\varepsilon\varepsilon' \mid X] = \begin{pmatrix} \mathbb{E}[\varepsilon_1^2 \mid X] & \cdots & \mathbb{E}[\varepsilon_1 \varepsilon_n \mid X] \\ \vdots & \ddots & \vdots \\ \mathbb{E}[\varepsilon_n \varepsilon_1 \mid X] & \cdots & \mathbb{E}[\varepsilon_n^2 \mid X] \end{pmatrix} = \sigma^2 I_n,$$
where the last equality holds by A4) and A5).
Then, by the previous Remark 5 (zero mean error term) and the law of iterated expectations,
$$\mathbb{E}[\varepsilon\varepsilon'] = \mathbb{E}\big[\mathbb{E}[\varepsilon\varepsilon' \mid X]\big] = \sigma^2 I_n.$$
Furthermore,
$$\operatorname{Var}(\varepsilon) = \mathbb{E}[\varepsilon\varepsilon'] - \mathbb{E}[\varepsilon]\,\mathbb{E}[\varepsilon]' = \sigma^2 I_n,$$
which completes the proof.

Lemma (conditional variance of least squares estimate).

Under A1~A5, we have
$$\operatorname{Var}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}.$$

Proof. From $\hat{\beta} = \beta + (X'X)^{-1}X'\varepsilon$, we have
$$\operatorname{Var}(\hat{\beta} \mid X) = (X'X)^{-1}X'\, \operatorname{Var}(\varepsilon \mid X)\, X (X'X)^{-1} = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma^2 (X'X)^{-1},$$
where the second equality holds by Proposition 13 (spherical errors).
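This formula can be verified by Monte Carlo, holding $X$ fixed across replications so that the simulated covariance of $\hat{\beta}$ targets the conditional variance. The design below is an arbitrary illustration, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(5)
n, R, sigma2 = 100, 20000, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # X held fixed
beta = np.array([1.0, 2.0, -0.5])
XtX_inv = np.linalg.inv(X.T @ X)

estimates = np.empty((R, 3))
for r in range(R):
    y = X @ beta + np.sqrt(sigma2) * rng.normal(size=n)  # homoskedastic errors
    estimates[r] = XtX_inv @ X.T @ y

# Sample covariance across replications vs the theoretical sigma^2 (X'X)^{-1}.
print(np.cov(estimates.T))
print(sigma2 * XtX_inv)
```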

To derive the unconditional variance of the least squares estimates, we first prove the following Lemma 15 (law of total variance).

Lemma (law of total variance).

Let $b$ be a scalar random variable and $A$ be a matrix of random variables. Then, we have
$$\operatorname{Var}(b) = \mathbb{E}\big[\operatorname{Var}(b \mid A)\big] + \operatorname{Var}\big(\mathbb{E}[b \mid A]\big).$$

Proof. First, since $b$ is scalar, by the definition,
$$\operatorname{Var}(b) = \mathbb{E}[b^2] - (\mathbb{E}[b])^2 = \mathbb{E}\big[\mathbb{E}[b^2 \mid A]\big] - \big(\mathbb{E}[\mathbb{E}[b \mid A]]\big)^2,$$
where the second equality holds by the law of iterated expectations.
Similarly, the conditional variance of $b$ is
$$\operatorname{Var}(b \mid A) = \mathbb{E}[b^2 \mid A] - (\mathbb{E}[b \mid A])^2.$$
Thus we have
$$\mathbb{E}\big[\operatorname{Var}(b \mid A)\big] = \mathbb{E}\big[\mathbb{E}[b^2 \mid A]\big] - \mathbb{E}\big[(\mathbb{E}[b \mid A])^2\big].$$
Also, note that
$$\operatorname{Var}\big(\mathbb{E}[b \mid A]\big) = \mathbb{E}\big[(\mathbb{E}[b \mid A])^2\big] - \big(\mathbb{E}[\mathbb{E}[b \mid A]]\big)^2.$$

Combining the results, we have
$$\mathbb{E}\big[\operatorname{Var}(b \mid A)\big] + \operatorname{Var}\big(\mathbb{E}[b \mid A]\big) = \mathbb{E}\big[\mathbb{E}[b^2 \mid A]\big] - \big(\mathbb{E}[\mathbb{E}[b \mid A]]\big)^2 = \mathbb{E}[b^2] - (\mathbb{E}[b])^2 = \operatorname{Var}(b).$$
This completes the proof.
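A quick numerical sanity check of the lemma (a toy example of my own, not from the notes): take $b \mid a \sim N(a, 1)$ with $a \sim \mathrm{Uniform}(0, 3)$, so $\mathbb{E}[\operatorname{Var}(b \mid a)] = 1$ and $\operatorname{Var}(\mathbb{E}[b \mid a]) = \operatorname{Var}(a) = 3/4$.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 1_000_000

# b | a ~ Normal(a, 1) with a ~ Uniform(0, 3):
# Var(b) = E[Var(b|a)] + Var(E[b|a]) = 1 + 9/12 = 1.75.
a = rng.uniform(0, 3, size=N)
b = a + rng.normal(size=N)

print(b.var())         # ~ 1.75
print(1.0 + a.var())   # E[Var(b|a)] + Var(E[b|a]) estimated from the draws
```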

Proposition (unconditional variance of least squares estimate).

Under A1~A5, we have
$$\operatorname{Var}(\hat{\beta}) = \sigma^2\, \mathbb{E}\big[(X'X)^{-1}\big].$$

Proof. Using Lemma 15 (law of total variance), we have
$$\operatorname{Var}(\hat{\beta}) = \mathbb{E}\big[\operatorname{Var}(\hat{\beta} \mid X)\big] + \operatorname{Var}\big(\mathbb{E}[\hat{\beta} \mid X]\big).$$
Since $\mathbb{E}[\hat{\beta} \mid X] = \beta$ by Proposition 4 (unbiasedness of least-squares estimate), we have $\operatorname{Var}(\mathbb{E}[\hat{\beta} \mid X]) = \operatorname{Var}(\beta) = 0$, and by Lemma 14 (conditional variance of least squares estimate), we have
$$\operatorname{Var}(\hat{\beta}) = \mathbb{E}\big[\sigma^2 (X'X)^{-1}\big] = \sigma^2\, \mathbb{E}\big[(X'X)^{-1}\big],$$
where the last equality holds as $\sigma^2$ is not a random variable.

Lemma (variance of j-th least squares estimate).

Let $R_j^2$ be the $R^2$ from a regression of $x_j$ on all regressors except $x_j$, where the regressors contain at least one constant variable. Now let $SST_j = \sum_{i=1}^n (x_{ij} - \bar{x}_j)^2$. Then under A1~A5, we have
$$\operatorname{Var}(\hat{\beta}_j \mid X) = \frac{\sigma^2}{SST_j (1 - R_j^2)}.$$

Proof. Now, let the regression be
$$y = X_1 \beta_1 + x_j \beta_j + \varepsilon,$$
where $X_1$ collects all regressors except $x_j$ and $\beta_j$ is scalar. Then, by the Properties of Least Squared Estimator > Theorem 8 (Frish-Waugh-Lovell Theorem), we have
$$\hat{\beta}_j = (x_j' M_1 x_j)^{-1} x_j' M_1 y,$$
where $M_1 = I_n - P_1$, $P_1 = X_1 (X_1'X_1)^{-1} X_1'$, given A3. Thus the regression is equivalent to regressing $M_1 y$ on $M_1 x_j$, the residual from a regression of $x_j$ on $X_1$. Then, we have
$$\operatorname{Var}(\hat{\beta}_j \mid X) = \sigma^2 (x_j' M_1 x_j)^{-1} = \frac{\sigma^2}{SST_j (1 - R_j^2)},$$
where the last equality holds by
$$x_j' M_1 x_j = \sum_{i=1}^n (x_{ij} - \hat{x}_{ij})^2 = SST_j (1 - R_j^2),$$
which is from Properties of Least Squared Estimator > Definition 3 (R-squared).
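The decomposition can be confirmed numerically: compute $\sigma^2 \big[(X'X)^{-1}\big]_{jj}$ directly and compare it with $\sigma^2 / \big(SST_j (1 - R_j^2)\big)$ built from the auxiliary regression. The design below is an arbitrary sketch, not from the notes.

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma2 = 300, 2.0
z = rng.normal(size=n)
xj = 0.7 * z + rng.normal(size=n)          # x_j correlated with the other regressor
X = np.column_stack([np.ones(n), z, xj])   # constant, z, x_j (j is the last column)

# Direct formula: sigma^2 * [(X'X)^{-1}]_{jj}.
direct = sigma2 * np.linalg.inv(X.T @ X)[2, 2]

# Decomposition: regress x_j on the remaining regressors to get R_j^2.
X1 = X[:, :2]
xj_hat = X1 @ np.linalg.solve(X1.T @ X1, X1.T @ xj)
sst_j = ((xj - xj.mean()) ** 2).sum()
r2_j = 1 - ((xj - xj_hat) ** 2).sum() / sst_j
decomposed = sigma2 / (sst_j * (1 - r2_j))

print(direct, decomposed)   # the two numbers agree
```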

Understanding the Variance of the Least Squares Estimate

From Lemma 17 (variance of j-th least squares estimate), the conditional variance of $\hat{\beta}_j$ can be decomposed into three components:

  • If the variance $\sigma^2$ of the error term increases, then $\operatorname{Var}(\hat{\beta}_j \mid X)$ increases.
  • If the data on the $j$-th variable has small deviation, i.e. $SST_j$ is small, then we have high uncertainty about the marginal effect of the $j$-th variable, thus we have high $\operatorname{Var}(\hat{\beta}_j \mid X)$.
  • If the explanatory power of the other regressors for the $j$-th variable is already high, i.e. $R_j^2$ is close to one, then the additional variable $x_j$ has little independent variation, thus we have high $\operatorname{Var}(\hat{\beta}_j \mid X)$.

Gauss-Markov Theorem

Assumption (Classic Assumptions).
  • A1) $\mathbb{E}[\varepsilon_i \mid X] = 0$, $i = 1, \dots, n$.
  • A2) The model relating $y$ and $X$ is linear and given by $y = X\beta + \varepsilon$.
  • A3) $n > k$ and $\operatorname{rank}(X) = k$, so that $X'X$ is invertible.
  • A4) $\operatorname{Var}(\varepsilon_i \mid X) = \sigma^2$, $i = 1, \dots, n$: homoskedasticity in the error term.
  • A5) $\mathbb{E}[\varepsilon_i \varepsilon_j \mid X] = 0$, $i \neq j$: independently distributed error terms.
Theorem (Gauss-Markov theorem).

Given A1~A5, $\hat{\beta}$ is the Best Linear Unbiased Estimator (BLUE) of $\beta$, i.e., if $\tilde{\beta}$ is any other linear unbiased estimator of $\beta$, then
$$\operatorname{Var}(\tilde{\beta} \mid X) \geq \operatorname{Var}(\hat{\beta} \mid X),$$
in the sense that the difference $\operatorname{Var}(\tilde{\beta} \mid X) - \operatorname{Var}(\hat{\beta} \mid X)$ is a positive semi-definite matrix.

Proof. By Lemma 15 (law of total variance), we have
$$\operatorname{Var}(\tilde{\beta}) - \operatorname{Var}(\hat{\beta}) = \mathbb{E}\big[\operatorname{Var}(\tilde{\beta} \mid X)\big] - \mathbb{E}\big[\operatorname{Var}(\hat{\beta} \mid X)\big],$$
where the equation holds by the unbiasedness of $\tilde{\beta}$ and $\hat{\beta}$, so that $\operatorname{Var}(\mathbb{E}[\tilde{\beta} \mid X]) = \operatorname{Var}(\mathbb{E}[\hat{\beta} \mid X]) = \operatorname{Var}(\beta) = 0$. Thus it is sufficient to show that $\operatorname{Var}(\tilde{\beta} \mid X) - \operatorname{Var}(\hat{\beta} \mid X)$ is positive semi-definite.

First, since $\tilde{\beta}$ is assumed to be a linear estimator, let
$$\tilde{\beta} = Ay,$$
where $A$ is a $k \times n$ matrix that may depend on $X$. Then, as $\tilde{\beta}$ is assumed to be an unbiased estimator, we have
$$\mathbb{E}[\tilde{\beta} \mid X] = AX\beta + A\,\mathbb{E}[\varepsilon \mid X] = AX\beta,$$
where the first equality holds by the regression equation $y = X\beta + \varepsilon$. As $\mathbb{E}[\tilde{\beta} \mid X] = \beta$ must hold for every $\beta$, we must have
$$AX = I_k.$$

Next, as we already have $\operatorname{Var}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}$ from Lemma 14 (conditional variance of least squares estimate), we now derive $\operatorname{Var}(\tilde{\beta} \mid X)$. From $\tilde{\beta} = AX\beta + A\varepsilon = \beta + A\varepsilon$,
$$\operatorname{Var}(\tilde{\beta} \mid X) = A\, \operatorname{Var}(\varepsilon \mid X)\, A' = \sigma^2 AA'.$$
Thus we have
$$\operatorname{Var}(\tilde{\beta} \mid X) - \operatorname{Var}(\hat{\beta} \mid X) = \sigma^2 \big(AA' - (X'X)^{-1}\big).$$

To show that $AA' - (X'X)^{-1}$ is positive semidefinite, we prove it by showing there exists some matrix $C$ such that $AA' - (X'X)^{-1} = CC'$. From $AX = I_k$, we have $(X'X)^{-1} = AX(X'X)^{-1}X'A'$. Therefore, by defining
$$C = A\big(I_n - X(X'X)^{-1}X'\big) = AM_X,$$
we have
$$CC' = A M_X M_X' A' = AA' - AX(X'X)^{-1}X'A' = AA' - (X'X)^{-1}.$$
Thus $AA' - (X'X)^{-1} = CC' \geq 0$, and therefore,
$$\operatorname{Var}(\tilde{\beta} \mid X) \geq \operatorname{Var}(\hat{\beta} \mid X).$$
Note that the equality holds when $C = AM_X = 0$, since then $A = AX(X'X)^{-1}X' = (X'X)^{-1}X'$ and therefore $\tilde{\beta} = (X'X)^{-1}X'y = \hat{\beta}$, equivalent to the least squares estimate.
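As an illustration of the theorem (a sketch with an arbitrary alternative estimator of my own choosing, not from the notes), compare OLS with the linear estimator $\tilde{\beta} = (X'WX)^{-1}X'Wy$ for some weight matrix $W \neq I_n$. Since $AX = I_k$ for $A = (X'WX)^{-1}X'W$, it is unbiased, but under homoskedasticity it is less efficient than OLS.

```python
import numpy as np

rng = np.random.default_rng(8)
n, R, sigma2 = 60, 20000, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # X held fixed
beta = np.array([1.0, 2.0])

# Alternative linear unbiased estimator: A = (X'WX)^{-1} X'W with W != I.
# Note A X = I_k, so unbiasedness holds by construction.
W = np.diag(rng.uniform(0.5, 2.0, size=n))
A = np.linalg.solve(X.T @ W @ X, X.T @ W)

ols = np.empty((R, 2))
alt = np.empty((R, 2))
for r in range(R):
    y = X @ beta + np.sqrt(sigma2) * rng.normal(size=n)
    ols[r] = np.linalg.solve(X.T @ X, X.T @ y)
    alt[r] = A @ y

# Var(alt) - Var(ols) should be positive semi-definite: eigenvalues >= 0.
diff = np.cov(alt.T) - np.cov(ols.T)
print(np.linalg.eigvalsh(diff))   # nonnegative up to simulation noise
```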

Estimate of Variance of Error

If the true errors $\varepsilon_i$ were observed, then we could estimate $\sigma^2$ directly by
$$\tilde{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \varepsilon_i^2.$$
However, as $\varepsilon_i$ is unknown in practice, we use a proxy, replacing the errors with the estimated residuals.

Definition (estimate of ).

Let $\hat{\varepsilon} = y - X\hat{\beta}$ be the estimated residuals from the given regression. Then the estimate of $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \hat{\varepsilon}_i^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n}.$$

Proposition (unbiased estimate of ).

Let the estimate of $\sigma^2$ be
$$\hat{\sigma}^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n}.$$
Then $\hat{\sigma}^2$ is a biased estimate of $\sigma^2$, while the unbiased estimate of $\sigma^2$ is
$$s^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n - k}.$$

Proof. As $\hat{\varepsilon} = M_X \varepsilon$, where $M_X = I_n - X(X'X)^{-1}X'$, we have
$$\hat{\varepsilon}'\hat{\varepsilon} = \varepsilon' M_X' M_X \varepsilon = \varepsilon' M_X \varepsilon.$$
Note that
$$\mathbb{E}[\varepsilon' M_X \varepsilon \mid X] = \mathbb{E}\big[\operatorname{tr}(M_X \varepsilon \varepsilon') \mid X\big] = \operatorname{tr}\big(M_X \, \sigma^2 I_n\big) = \sigma^2 \operatorname{tr}(M_X) = \sigma^2 (n - k).$$
Therefore, we have
$$\mathbb{E}[\hat{\sigma}^2] = \frac{n - k}{n} \sigma^2 \neq \sigma^2,$$
thus $\hat{\sigma}^2$ is a biased estimate.

However, by defining
$$s^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n - k},$$
we have $\mathbb{E}[s^2] = \sigma^2$. Thus $s^2$ is an unbiased estimate of $\sigma^2$.
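The degrees-of-freedom correction is easy to see in simulation: with $n = 30$ and $k = 3$, the uncorrected estimator should average about $\sigma^2 (n - k)/n = 0.9\,\sigma^2$. The sketch below uses arbitrary simulated data for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)
n, k, R, sigma2 = 30, 3, 50000, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
P = X @ np.linalg.solve(X.T @ X, X.T)   # projection matrix onto col(X)

sig_hat = np.empty(R)
s2 = np.empty(R)
for r in range(R):
    y = X @ beta + rng.normal(size=n)
    resid = y - P @ y
    ssr = resid @ resid
    sig_hat[r] = ssr / n        # biased: E = sigma^2 (n - k) / n
    s2[r] = ssr / (n - k)       # unbiased

print(sig_hat.mean())   # ~ 0.9
print(s2.mean())        # ~ 1.0
```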

Proposition (correlated error and residual).

The estimated residuals $\hat{\varepsilon}$ and the true error term $\varepsilon$ are correlated by construction, since $\hat{\varepsilon} = M_X \varepsilon$ mixes all the entries of $\varepsilon$.